Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory

Abstract

Optical Character Recognition (OCR) is a system for converting images that include text into editable text, and it is applied to various languages such as English, Arabic, and Persian. While these languages have similarities, their fundamental differences can create unique challenges. In Persian, continuity between characters, the existence of semicircles, dots, and oblique strokes, and left-to-right characters such as English words in the context are some of the most important challenges in designing Persian OCR systems. Our proposed framework, Bina, is designed in a special way to address this issue by utilizing a Convolutional Neural Network (CNN) and a deep bidirectional Long Short-Term Memory (BLSTM) network, a type of LSTM network that has access to both past and future context. A large and diverse dataset of about 2M samples, consisting of various fonts and sizes, was also generated to train the model and test its performance. Various configurations were tested to find the optimal structure of the CNN and BLSTM. The results show that Bina successfully outperformed the state-of-the-art baseline algorithm, achieving 96% accuracy and 88% accuracy across the contexts tested.
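The phrase "access to both past and future context" can be illustrated with a minimal sketch. It is not the Bina model: it uses a scalar LSTM cell in pure Python, and the weights, the input values, and the function names (`lstm_pass`, `blstm_pass`) are hypothetical choices made only for illustration. The sketch runs one LSTM left to right and a second one right to left, then pairs their hidden states at each position.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_pass(xs, w):
    """Run a scalar LSTM over xs and return the hidden state at every step."""
    h, c = 0.0, 0.0
    outputs = []
    for x in xs:
        i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])    # input gate
        f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])    # forget gate
        o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])    # output gate
        g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])  # candidate cell value
        c = f * c + i * g                                   # update cell state
        h = o * math.tanh(c)                                # update hidden state
        outputs.append(h)
    return outputs


def blstm_pass(xs, w_fwd, w_bwd):
    """Pair a left-to-right pass with a right-to-left pass at each position."""
    forward = lstm_pass(xs, w_fwd)
    # Reverse the input, run the second LSTM, then restore the original order.
    backward = lstm_pass(xs[::-1], w_bwd)[::-1]
    return list(zip(forward, backward))


# Hypothetical fixed weights; a real model would learn these from data.
W_FWD = dict(wi=0.5, ui=0.1, bi=0.0, wf=0.4, uf=0.2, bf=1.0,
             wo=0.6, uo=0.1, bo=0.0, wg=0.9, ug=0.3, bg=0.0)
W_BWD = dict(wi=0.3, ui=0.2, bi=0.0, wf=0.5, uf=0.1, bf=1.0,
             wo=0.7, uo=0.2, bo=0.0, wg=0.8, ug=0.1, bg=0.0)

# Stand-in for a sequence of per-column image features (assumed values).
features = [0.2, -0.1, 0.7, 0.4]
states = blstm_pass(features, W_FWD, W_BWD)
```

Changing only the last element of `features` leaves the forward state at the first position unchanged but alters the backward state there, which is the sense in which a bidirectional network sees future context as well as past context.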


Similar resources

Mobile Gesture Recognition using Hierarchical Recurrent Neural Network with Bidirectional Long Short-Term Memory

As the sensors embedded to a smartphone are proliferating, many application systems for context-aware services are actively investigated. This paper proposes a gesture recognition system with smartphones for better interface. It is important to maintain high accuracy even with the large number of gestures. To improve the accuracy, we adopt the recurrent neural network based on hierarchical BLST...

Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks

Motivation Capturing long-range interactions between structural but not sequence neighbors of proteins is a long-standing challenging problem in bioinformatics. Recently, long short-term memory (LSTM) networks have significantly improved the accuracy of speech and image classification problems by remembering useful past information in long sequential events. Here, we have implemented deep bidir...

Modelling Radiological Language with Bidirectional Long Short-Term Memory Networks

Motivated by the need to automate medical information extraction from free-text radiological reports, we present a bi-directional long short-term memory (BiLSTM) neural network architecture for modelling radiological language. The model has been used to address two NLP tasks: medical named-entity recognition (NER) and negation detection. We investigate whether learning several types of word emb...

Bidirectional Long Short-Term Memory Networks for Relation Classification

Relation classification is an important semantic processing, which has achieved great attention in recent years. The main challenge is the fact that important information can appear at any position in the sentence. Therefore, we propose bidirectional long short-term memory networks (BLSTM) to model the sentence with complete, sequential information about all words. At the same time, we also use...

A Novel Approach to On-Line Handwriting Recognition Based on Bidirectional Long Short-Term Memory Networks

In this paper we introduce a new connectionist approach to on-line handwriting recognition and address in particular the problem of recognizing handwritten whiteboard notes. The approach uses a bidirectional recurrent neural network with long short-term memory blocks. We use a recently introduced objective function, known as Connectionist Temporal Classification (CTC), that directly trains the ...

Journal

Journal title: Applied Sciences

Year: 2022

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app122211760